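Introduction to Velero
Velero is a tool for safely backing up, restoring, and migrating Kubernetes cluster resources and persistent volumes.
With Velero's resource backup capability you can back up and restore a Kubernetes cluster, copy cluster resources to another Kubernetes cluster, or quickly replicate a production environment into a test environment. Conceptually, the backup resembles an export of every resource's YAML manifest in one pass, which keeps the set of resources complete.
Velero supports many storage backends, including AWS S3, Azure Blob, Google Cloud Storage, Alibaba Cloud OSS, Swift, and MinIO.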
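Velero workflow
Velero backup process
- The local Velero client sends a backup request.
- A Backup object is created in the Kubernetes cluster.
- The BackupController detects the new Backup object and starts the backup.
- The BackupController queries the API server for the relevant data.
- The BackupController uploads the collected data to the remote object storage.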
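The Backup object created in the second step is an ordinary custom resource. As a minimal sketch, the function below prints what such an object might look like for a backup of a single namespace. The name `demo-backup` is hypothetical, and the `ttl` and `storageLocation` values are assumed to match the defaults shown in the `velero backup describe` output later in this guide.

```shell
#!/usr/bin/env bash
# Print a minimal Backup manifest (velero.io/v1) for one namespace.
# "demo-backup" is a hypothetical name used only for illustration.
backup_manifest() {
  local name="$1" namespace="$2"
  cat <<EOF
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: ${name}
  namespace: velero
spec:
  includedNamespaces:
    - ${namespace}
  storageLocation: default
  ttl: 720h0m0s
EOF
}

backup_manifest demo-backup default
# To submit it to a cluster where Velero is installed:
#   backup_manifest demo-backup default | kubectl apply -f -
```

Applying the manifest with kubectl should have the same effect as the client request in step one, since the BackupController only watches for Backup objects.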
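Velero features
Velero currently provides the following features:
- Backup and restore of Kubernetes cluster data
- Copying resources from the current Kubernetes cluster to other Kubernetes clusters
- Replicating a production environment into development and test environments
Velero components
Velero consists of two parts, a server and a client.
- Server: runs inside your Kubernetes cluster
- Client: a command-line tool that runs locally; it must be on a machine where kubectl and the cluster's kubeconfig are already configured
Backup storage supported by Velero
- AWS S3 and S3-compatible storage such as MinIO
- Azure Blob Storage
- Google Cloud Storage
- Alibaba Cloud OSS
Velero use cases
- Disaster recovery: back up and restore a Kubernetes cluster
- Migration: copy cluster resources to another cluster (for example, to keep development, test, and production cluster configuration in sync and simplify environment setup)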
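Velero backup vs. etcd backup
- An etcd backup captures the state of the entire cluster as a single unit.
- Velero backs up at the level of individual Kubernetes objects.
- Besides backing up the whole cluster, Velero can back up or restore a subset of objects selected by type, namespace, or label.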
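The three selection criteria map onto flags of `velero backup create` that are listed in the command reference further down. The sketch below shows one command per criterion. The backup names, the `web` namespace, and the `app=nginx` label are hypothetical. By default the script only prints the commands; set `DRY_RUN=0` to run them against a cluster where Velero is installed.

```shell
#!/usr/bin/env bash
# Print each command by default; set DRY_RUN=0 to execute it instead.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# By namespace: everything in the (hypothetical) "web" namespace.
run velero backup create web-ns-backup --include-namespaces web

# By resource type: only Deployments and ConfigMaps.
run velero backup create workloads-backup --include-resources deployments,configmaps

# By label: only objects labelled app=nginx.
run velero backup create nginx-label-backup --selector app=nginx
```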
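Note: objects created while a backup is in progress are not included in that backup.
Deploying Velero
For migration, install the Velero client on a separate machine that can connect to both clusters. For backup only, installing it on a control-plane node is enough.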
Velero version | Expected Kubernetes version compatibility | Tested on Kubernetes version |
---|---|---|
1.11 | 1.18-latest | 1.23.10, 1.24.9, 1.25.5, and 1.26.1 |
1.10 | 1.18-latest | 1.22.5, 1.23.8, 1.24.6 and 1.25.1 |
1.9 | 1.18-latest | 1.20.5, 1.21.2, 1.22.5, 1.23, and 1.24 |
1.8 | 1.18-latest |
wget https://github.com/vmware-tanzu/velero/releases/download/v1.9.1/velero-v1.9.1-linux-amd64.tar.gz
tar -xf velero-v1.9.1-linux-amd64.tar.gz -C /opt/
Configure the Alibaba Cloud OSS credentials
vim /opt/velero-v1.9.1-linux-amd64/credentials-velero
ALIBABA_CLOUD_ACCESS_KEY_ID=******
ALIBABA_CLOUD_ACCESS_KEY_SECRET=******
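If you prefer not to edit the file interactively, the same two lines can be written from a script. This is a minimal sketch: the `write_credentials` helper and the placeholder values are illustrative only, and the example writes to a temporary file rather than to the real `credentials-velero` path.

```shell
#!/usr/bin/env bash
# Write an Alibaba Cloud credentials file non-interactively.
# Arguments: target path, AccessKey ID, AccessKey secret.
write_credentials() {
  local path="$1" key_id="$2" key_secret="$3"
  # Use a restrictive umask in a subshell: the file holds a secret.
  (
    umask 077
    cat > "$path" <<EOF
ALIBABA_CLOUD_ACCESS_KEY_ID=${key_id}
ALIBABA_CLOUD_ACCESS_KEY_SECRET=${key_secret}
EOF
  )
}

# Example with placeholder values written to a temporary file.
tmp_file="$(mktemp)"
write_credentials "$tmp_file" "<access-key-id>" "<access-key-secret>"
cat "$tmp_file"
rm -f "$tmp_file"
```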
cp /opt/velero-v1.9.1-linux-amd64/velero /usr/bin/
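Enable shell completion (velero completion bash only prints the completion script, so it has to be sourced)
source <(velero completion bash)
Install Velero into the Kubernetes cluster: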
BUCKET=zhangqifeng-dev
REGION=cn-shanghai
cd /opt/velero-v1.9.1-linux-amd64/
velero install \
--provider alibabacloud \
--image registry.$REGION.aliyuncs.com/acs/velero:1.4.2-2b9dce65-aliyun \
--bucket $BUCKET \
--secret-file ./credentials-velero \
--use-volume-snapshots=false \
--backup-location-config region=$REGION \
--use-restic \
--plugins registry.$REGION.aliyuncs.com/acs/velero-plugin-alibabacloud:v1.0.0-2d33b89 \
--wait
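# These flags were used with the Velero 1.9.1 client; newer versions may not accept them (from Velero 1.10 onward, --use-restic is superseded by --use-node-agent)
To use the internal OSS endpoint, add: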
--backup-location-config region=$REGION,network=internal
To store the backup files under a prefix in the OSS bucket, add:
--prefix <your oss bucket prefix>
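The two optional settings can be combined with the base install command. The sketch below assembles the argument list in a bash array so the options are added only when requested. The `build_install_args` helper and the prefix `cluster-a` are hypothetical; every flag and image tag is copied from the install command above.

```shell
#!/usr/bin/env bash
# Assemble `velero install` arguments, optionally adding the internal OSS
# endpoint and a bucket prefix. "cluster-a" is a hypothetical prefix.
BUCKET=zhangqifeng-dev
REGION=cn-shanghai

build_install_args() {
  local use_internal="$1" prefix="$2"
  local location="region=${REGION}"
  [ "$use_internal" = "yes" ] && location="${location},network=internal"

  local args=(install
    --provider alibabacloud
    --image "registry.${REGION}.aliyuncs.com/acs/velero:1.4.2-2b9dce65-aliyun"
    --bucket "$BUCKET"
    --secret-file ./credentials-velero
    --use-volume-snapshots=false
    --backup-location-config "$location"
    --use-restic
    --plugins "registry.${REGION}.aliyuncs.com/acs/velero-plugin-alibabacloud:v1.0.0-2d33b89"
    --wait)
  [ -n "$prefix" ] && args+=(--prefix "$prefix")
  # Print one argument per line.
  printf '%s\n' "${args[@]}"
}

# Show the arguments; to run the install for real:
#   mapfile -t args < <(build_install_args yes cluster-a) && velero "${args[@]}"
build_install_args yes cluster-a
```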
Verify:
kubectl get all -n velero
NAME READY STATUS RESTARTS AGE
pod/restic-ghwhr 1/1 Running 0 5m1s
pod/restic-rbc5c 1/1 Running 0 5m1s
pod/velero-75944b59fb-lqvpq 1/1 Running 0 5m1s
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/restic 2 2 2 2 2 <none> 5m1s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/velero 1/1 1 1 5m1s
NAME DESIRED CURRENT READY AGE
replicaset.apps/velero-75944b59fb 1 1 1 5m1s
Uninstall:
kubectl delete namespace/velero clusterrolebinding/velero
kubectl delete crds -l component=velero
Velero command reference
velero create backup --help
Create a backup
Usage:
velero create backup NAME [flags]
Examples:
# Create a backup containing all resources.
velero backup create backup1
# Create a backup including only the nginx namespace.
velero backup create nginx-backup --include-namespaces nginx
# Create a backup excluding the velero and default namespaces.
velero backup create backup2 --exclude-namespaces velero,default
# Create a backup based on a schedule named daily-backup.
velero backup create --from-schedule daily-backup
# View the YAML for a backup that doesn't snapshot volumes, without sending it to the server.
velero backup create backup3 --snapshot-volumes=false -o yaml
# Wait for a backup to complete before returning from the command.
velero backup create backup4 --wait
# Exclude namespaces
--exclude-namespaces stringArray namespaces to exclude from the backup
# Exclude resource types
--exclude-resources stringArray resources to exclude from the backup, formatted as resource.group, such as storageclasses.storage.k8s.io
# Include cluster-scoped resources
--include-cluster-resources optionalBool[=true] include cluster-scoped resources in the backup
# Include namespaces
--include-namespaces stringArray namespaces to include in the backup (use '*' for all namespaces) (default *)
# Include resource types
--include-resources stringArray resources to include in the backup, formatted as resource.group, such as storageclasses.storage.k8s.io (use '*' for all resources)
# Attach labels to the backup itself
--labels mapStringString labels to apply to the backup
-o, --output string Output display format. For create commands, display the object but do not send it to the server. Valid formats are 'table', 'json', and 'yaml'. 'table' is not valid for the install command.
# Back up only resources that match a label selector
-l, --selector labelSelector only back up resources matching this label selector (default <none>)
# Take snapshots of PVs
--snapshot-volumes optionalBool[=true] take snapshots of PersistentVolumes as part of the backup
# Where to store the backup
--storage-location string location in which to store the backup
# How long to keep the backup before it is deleted
--ttl duration how long before the backup can be garbage collected (default 720h0m0s)
# Where to store volume snapshots, i.e. which cloud provider driver to use
--volume-snapshot-locations strings list of locations (at most one per provider) where volume snapshots should be stored
Backing up Kubernetes resource objects with Velero
Create a backup
velero backup create namespace-default-backup --include-namespaces default
Backup request "namespace-default-backup" submitted successfully.
Run `velero backup describe namespace-default-backup` or `velero backup logs namespace-default-backup` for more details.
Inspect the backup
velero backup describe namespace-default-backup
Name: namespace-default-backup
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: velero.io/source-cluster-k8s-gitversion=v1.23.9
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=23
Phase: Completed
Errors: 0
Warnings: 0
Namespaces:
Included: default
Excluded: <none>
Resources:
Included: *
Excluded: <none>
Cluster-scoped: auto
Label selector: <none>
Storage Location: default
Velero-Native Snapshot PVs: auto
TTL: 720h0m0s
Hooks: <none>
Backup Format Version: 1.1.0
Started: 2025-05-16 18:05:58 +0800 CST
Completed: 2025-05-16 18:05:59 +0800 CST
Expiration: 2025-06-15 18:05:58 +0800 CST
Total items to be backed up: 23
Items backed up: 23
Velero-Native Snapshots: <none included>
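The TTL and Expiration fields in the output above are related by simple arithmetic: 720 hours is 30 days, which is why a backup started on 2025-05-16 expires on 2025-06-15. The helper below is a small sketch for converting a TTL to days; the function name `ttl_hours_to_days` is illustrative, and it only reads the hours component of the duration.

```shell
#!/usr/bin/env bash
# Convert a Velero TTL such as "720h" or "720h0m0s" to whole days.
# Only the hours component is considered; the result is rounded down.
ttl_hours_to_days() {
  local hours="${1%%h*}"   # keep everything before the first "h"
  echo $(( hours / 24 ))
}

ttl_hours_to_days 720h0m0s   # default TTL, prints 30
ttl_hours_to_days 2160h      # prints 90
```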
Check the backup storage location
velero backup-location get
NAME PROVIDER BUCKET/PREFIX PHASE LAST VALIDATED ACCESS MODE DEFAULT
default alibabacloud zhangqifeng-dev Unknown Unknown ReadWrite true
List the backup objects
kubectl get backups.velero.io -n velero
NAME AGE
namespace-default-backup 75s
Restoring Kubernetes resource objects with Velero
Original resources:
kubectl get pod
NAME READY STATUS RESTARTS AGE
busybox1-857448d9ff-shdfz 1/1 Running 3 (145m ago) 8d
busybox2-5c8f48d797-kk8mx 1/1 Running 3 (145m ago) 8d
busybox3-c997b9cc4-tlvrq 1/1 Running 2 (3d2h ago) 8d
nginx-deployment-6b9d659f5f-qbchs 1/1 Running 2 (145m ago) 8d
kubectl get all
NAME READY STATUS RESTARTS AGE
pod/busybox1-857448d9ff-shdfz 1/1 Running 3 (145m ago) 8d
pod/busybox2-5c8f48d797-kk8mx 1/1 Running 3 (145m ago) 8d
pod/busybox3-c997b9cc4-tlvrq 1/1 Running 2 (3d2h ago) 8d
pod/nginx-deployment-6b9d659f5f-qbchs 1/1 Running 2 (145m ago) 8d
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 8d
service/nginx-deployment NodePort 10.107.254.44 <none> 80:30008/TCP 8d
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/busybox1 1/1 1 1 8d
deployment.apps/busybox2 1/1 1 1 8d
deployment.apps/busybox3 1/1 1 1 8d
deployment.apps/nginx-deployment 1/1 1 1 8d
NAME DESIRED CURRENT READY AGE
replicaset.apps/busybox1-857448d9ff 1 1 1 8d
replicaset.apps/busybox2-5c8f48d797 1 1 1 8d
replicaset.apps/busybox3-c997b9cc4 1 1 1 8d
replicaset.apps/nginx-deployment-6b9d659f5f 1 1 1 8d
Delete everything as a test:
kubectl delete all --all
Restore:
velero restore create --from-backup namespace-default-backup --wait
Verify:
kubectl get all
NAME READY STATUS RESTARTS AGE
pod/busybox1-857448d9ff-shdfz 1/1 Running 0 106s
pod/busybox2-7bbcb84796-7zk2n 1/1 Running 0 106s
pod/busybox3-5774857479-crmsb 1/1 Running 0 106s
pod/nginx-deployment-6b9d659f5f-qbchs 1/1 Running 0 106s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 3m21s
service/nginx-deployment NodePort 10.97.108.66 <none> 80:32704/TCP 106s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/busybox1 1/1 1 1 106s
deployment.apps/busybox2 1/1 1 1 106s
deployment.apps/busybox3 1/1 1 1 106s
deployment.apps/nginx-deployment 1/1 1 1 106s
NAME DESIRED CURRENT READY AGE
replicaset.apps/busybox1-857448d9ff 1 1 1 106s
replicaset.apps/busybox2-5c8f48d797 0 0 0 106s
replicaset.apps/busybox2-7bbcb84796 1 1 1 106s
replicaset.apps/busybox3-5774857479 1 1 1 106s
replicaset.apps/busybox3-c997b9cc4 0 0 0 106s
replicaset.apps/nginx-deployment-6b9d659f5f 1 1 1 106s
Scheduled backups
# Create a backup every 6 hours
velero create schedule NAME --schedule="0 */6 * * *"
# Create a backup every 6 hours with the @every notation
velero create schedule NAME --schedule="@every 6h"
# Create a daily backup of the web namespace
velero create schedule NAME --schedule="@every 24h" --include-namespaces web
# Create a weekly backup, each living for 90 days (2160 hours)
velero create schedule NAME --schedule="@every 168h" --ttl 2160h0m0s
# Back up the anchnet-devops-dev, anchnet-devops-test, anchnet-devops-prod, and xxxxx-devops-common-test namespaces daily
velero create schedule anchnet-devops-dev --schedule="@every 24h" --include-namespaces xxxxx-devops-dev
velero create schedule anchnet-devops-test --schedule="@every 24h" --include-namespaces xxxxx-devops-test
velero create schedule anchnet-devops-prod --schedule="@every 24h" --include-namespaces xxxxx-devops-prod
velero create schedule anchnet-devops-common-test --schedule="@every 24h" --include-namespaces xxxxx-devops-common-test
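The four commands above differ only in the environment suffix, so they can be generated with a loop. This is a sketch using the same schedule names and namespaces as above; the `create_daily_schedules` function name is illustrative. By default it only prints the commands (without shell quoting); set `DRY_RUN=0` to execute them.

```shell
#!/usr/bin/env bash
# Print each command by default; set DRY_RUN=0 to execute it instead.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Create one daily schedule per environment.
create_daily_schedules() {
  local env
  for env in dev test prod common-test; do
    run velero create schedule "anchnet-devops-${env}" \
      --schedule="@every 24h" \
      --include-namespaces "xxxxx-devops-${env}"
  done
}

create_daily_schedules
```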
Example:
velero create schedule nginx-backups --schedule="0 */1 * * *" --include-namespaces nginx-example
Schedule "nginx-backups" created successfully.
velero get schedules
NAME STATUS CREATED SCHEDULE BACKUP TTL LAST BACKUP SELECTOR PAUSED
nginx-backups Enabled 2023-06-09 10:36:08 +0800 CST 0 */1 * * * 0s n/a <none> false
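Notes:
- Objects created while a Velero backup is in progress are not included in that backup.
- velero restore does not overwrite existing resources; it only restores resources that do not currently exist in the cluster. Existing resources are not rolled back to their earlier version. To roll one back, delete the existing resource before running the restore.
- Velero can also run backups on a cron-style schedule, as the schedule commands above show, to back up data periodically.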
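The second note implies a two-step procedure when an existing object has to be rolled back: delete it, then restore. The sketch below illustrates this for a single Deployment, using the deployment and backup names from the earlier examples. The `rollback_deployment` helper is hypothetical. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/usr/bin/env bash
# Print each command by default; set DRY_RUN=0 to execute it instead.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Delete the current version of a Deployment, then restore from a backup.
# Because a restore skips objects that already exist, only the deleted
# Deployment (and anything else that is missing) is recreated.
rollback_deployment() {
  local name="$1" namespace="$2" backup="$3"
  run kubectl delete deployment "$name" -n "$namespace"
  run velero restore create --from-backup "$backup" --wait
}

rollback_deployment nginx-deployment default namespace-default-backup
```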
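Deploying Velero in another Kubernetes cluster and restoring the application
Copy /root/.kube from the other Kubernetes cluster to this machine and connect to the cluster you want to restore into: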
kubectl get node
NAME STATUS ROLES AGE VERSION
master01 Ready control-plane,master 27h v1.23.9
master02 Ready control-plane,master 27h v1.23.9
master03 Ready control-plane,master 27h v1.23.9
worker01 Ready <none> 27h v1.23.9
worker02 Ready <none> 27h v1.23.9
cd /opt/velero-v1.9.1-linux-amd64/
velero install \
--provider alibabacloud \
--image registry.$REGION.aliyuncs.com/acs/velero:1.4.2-2b9dce65-aliyun \
--bucket $BUCKET \
--secret-file ./credentials-velero \
--use-volume-snapshots=false \
--backup-location-config region=$REGION \
--use-restic \
--plugins registry.$REGION.aliyuncs.com/acs/velero-plugin-alibabacloud:v1.0.0-2d33b89 \
--wait
velero restore create --from-backup namespace-default-backup --wait
kubectl get all
NAME READY STATUS RESTARTS AGE
pod/busybox1-857448d9ff-shdfz 1/1 Running 0 6m34s
pod/busybox2-77b7d67466-jkwmj 1/1 Running 0 66s
pod/busybox3-6474c748cc-9dhw7 1/1 Running 0 41s
pod/nginx-deployment-6b9d659f5f-qbchs 1/1 Running 0 6m34s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 7m37s
service/nginx-deployment NodePort 10.110.228.5 <none> 80:30138/TCP 6m33s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/busybox1 1/1 1 1 6m33s
deployment.apps/busybox2 1/1 1 1 6m33s
deployment.apps/busybox3 1/1 1 1 6m33s
deployment.apps/nginx-deployment 1/1 1 1 6m33s
NAME DESIRED CURRENT READY AGE
replicaset.apps/busybox1-857448d9ff 1 1 1 6m34s
replicaset.apps/busybox2-5c8f48d797 0 0 0 6m33s
replicaset.apps/busybox2-77b7d67466 1 1 1 66s
replicaset.apps/busybox2-7bbcb84796 0 0 0 6m33s
replicaset.apps/busybox3-5774857479 0 0 0 6m33s
replicaset.apps/busybox3-6474c748cc 1 1 1 41s
replicaset.apps/busybox3-c997b9cc4 0 0 0 6m33s
replicaset.apps/nginx-deployment-6b9d659f5f 1 1 1 6m33s
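The resources have been migrated to the new cluster. Note that only the Kubernetes resource objects were migrated; application data was not, and must be synchronized separately.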